(a) loading a first vector into a first register, said first vector comprising a
plurality of N-bit elements;

(b) loading a second vector into a second register, said second vector
comprising a plurality of N-bit elements;

(c) executing an arithmetic instruction for at least one pair consisting of an
N-bit element in said first register and an N-bit element in said second register, to produce a
resulting element;

(d) writing said resulting element into an M-bit element of an accumulator,
wherein M is greater than N;

(e) transforming said resulting element in said accumulator into a width of
N-bits; and

(f) writing said resulting element into a third register.
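
Steps (a) through (f) can be pictured with a short sketch. The Python model below is illustrative only: it assumes N = 16 and M = 48, signed integer elements, multiplication as the arithmetic instruction, and saturation as the transformation of step (e). The function names are hypothetical and do not come from the claims.

```python
N = 16  # element width in the source registers (assumed)
M = 48  # element width in the accumulator (assumed, M > N)

def clamp_signed(value, bits):
    """Saturate value to the signed range representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def simd_multiply(first_register, second_register):
    """Model steps (c) through (f) for every pair of elements."""
    # (c)/(d): each product is held in an M-bit accumulator element.
    # A 16x16-bit product always fits in 48 bits, so this never saturates.
    accumulator = [clamp_signed(a * b, M)
                   for a, b in zip(first_register, second_register)]
    # (e): transform each M-bit element to N bits (saturation assumed here).
    # (f): the returned list stands in for the third register.
    return [clamp_signed(x, N) for x in accumulator]

# (a)/(b): the two lists stand in for the loaded first and second registers.
third = simd_multiply([1000, -2, 3, 4], [1000, 5, -6, 7])
print(third)  # [32767, -10, -18, 28]
```

The first product, 1,000,000, does not fit in 16 bits but is held exactly in the wider accumulator element before being reduced to the N-bit result.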

42. The method as recited in claim [illegible], wherein said accumulator comprises a plurality of
M-bit elements and wherein steps (c)-(f) operate on a plurality of elements of said first and second
vectors to produce a resultant vector formed from a plurality of resulting elements written to said
third register.

43. The method as recited in claim [illegible], further comprising a step before step (c) of:

selecting an element from said second register; and

copying said element into all other elements of said second register.
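
The select-and-copy step amounts to broadcasting one element across the second register. A minimal sketch, assuming the register is modeled as a Python list and using a hypothetical helper name:

```python
def broadcast_element(register, index):
    """Return a copy of `register` with the element at `index` in every position."""
    selected = register[index]          # select one element
    return [selected] * len(register)   # copy it into all other positions

second = [10, 20, 30, 40]
print(broadcast_element(second, 2))  # [30, 30, 30, 30]
```

After this step, an arithmetic instruction applied to the first register and the broadcast register behaves like a vector-by-scalar operation.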




44. The method as recited in claim 42, further comprising a step before step (f) of:

selecting a subset of said resulting elements in said accumulator for writing to said
third register, said subset being chosen from any one of: the low third bits, the middle third bits,
and the high third bits of said resulting elements in said accumulator.
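
Selecting a third of an accumulator element can be sketched as a shift and a mask. The example assumes N = 16 and an accumulator element of M = 3N = 48 bits; the helper name and the string labels are hypothetical.

```python
N = 16
M = 3 * N  # assumed accumulator element width

def select_third(acc_element, which):
    """Return the low, middle, or high N bits of an M-bit accumulator element."""
    shift = {"low": 0, "middle": N, "high": 2 * N}[which]
    raw = acc_element & ((1 << M) - 1)  # view the element as an M-bit pattern
    return (raw >> shift) & ((1 << N) - 1)

acc = 0x0001_2345_6789
print(hex(select_third(acc, "low")))     # 0x6789
print(hex(select_third(acc, "middle")))  # 0x2345
print(hex(select_third(acc, "high")))    # 0x1
```

Each selected third is exactly N bits wide, so it fits an element of the destination register without further conversion.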




45. The method as recited in claim 42, wherein M is equal to three times N.

46. The method as recited in claim [illegible], wherein N is equal to eight or sixteen.

47. The method as recited in claim 42, wherein said resulting elements in said accumulator
are wrapped around the representable range of said resulting elements.
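
Wrapping around the representable range can be read as ordinary two's-complement overflow in an M-bit element. The sketch below assumes signed elements and M = 48; the helper name is hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap value into the signed two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1        # keep only the low `bits` bits
    if value >= 1 << (bits - 1):    # sign bit set: reinterpret as negative
        value -= 1 << bits
    return value

M = 48
largest = (1 << (M - 1)) - 1
print(wrap_signed(largest + 1, M) == -(1 << (M - 1)))  # True
```

Under this reading, an accumulator element that exceeds its largest value continues from the smallest value rather than saturating.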

48. The method as recited in claim [illegible], further comprising a step before step (f) of:

dividing said resulting elements stored in said accumulator into a plurality of
subsets;

writing each subset to at least one of a plurality of registers, each of said plurality
of registers having a width smaller than said accumulator width.

49. The method as recited in claim [illegible], wherein said loading step (a) and said loading step (b)
are not formatted.

50. The method as recited in claim [illegible], further comprising a step before step (d) of:

formatting said resulting element as specified in said arithmetic instruction.


51. The method as recited in claim [illegible], wherein said arithmetic instruction is any one of:
addition, multiplication and subtraction.

52. The method as recited in claim [illegible], wherein step (e) comprises the steps of:

shifting said resulting element in said accumulator for scaling the value of said
resulting element;

rounding said resulting element; and

clamping said resulting element.







53. The method as recited in claim [illegible], wherein said rounding step comprises one of:

rounding said resulting element towards zero;

rounding said resulting element towards the nearest unit, wherein said resulting
element is rounded away from zero if said resulting element is at least halfway towards the
nearest unit; and

rounding said resulting element towards the nearest unit, wherein said resulting
element is rounded towards zero if said resulting element is at least halfway towards the nearest
unit.
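
The shift, round, and clamp sequence and the three rounding alternatives can be sketched together. This is one possible reading, not the claimed implementation: it assumes a right shift by a non-negative count, a signed N = 16 bit result, and interprets the second and third alternatives as round-to-nearest with the exact halfway case going away from zero and towards zero respectively. The function and mode names are hypothetical.

```python
def shift_round_clamp(value, shift, mode, bits=16):
    """Scale `value` by 2**-shift, round per `mode`, then saturate to `bits` bits."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    whole = magnitude >> shift                  # integer part of the scaled magnitude
    remainder = magnitude & ((1 << shift) - 1)  # discarded fraction bits
    half = (1 << shift) >> 1                    # exactly one half (0 when shift == 0)
    if mode == "nearest_away" and shift > 0 and remainder >= half:
        whole += 1   # halfway cases go away from zero
    elif mode == "nearest_toward" and shift > 0 and remainder > half:
        whole += 1   # halfway cases go towards zero
    # mode == "zero": the fraction is simply dropped
    result = sign * whole
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, result))             # clamp to the N-bit range

# 5 shifted right by 1 is 2.5, the exact halfway case
print(shift_round_clamp(5, 1, "zero"))            # 2
print(shift_round_clamp(5, 1, "nearest_away"))    # 3
print(shift_round_clamp(5, 1, "nearest_toward"))  # 2
```

Working on the magnitude and restoring the sign afterwards keeps the three modes symmetric for negative values.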

54. The method as recited in claim [illegible], further comprising a step before step (d) of:

adding an element previously stored in said accumulator to said resulting element.
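
Adding the previously stored accumulator element to each new result gives a multiply-accumulate pattern. A minimal sketch, assuming multiplication as the arithmetic instruction; Python integers are unbounded, so the M-bit limit of the accumulator is not modeled here, and the function name is hypothetical.

```python
def multiply_accumulate(accumulator, first_register, second_register):
    """Add each new product to the element already held in the accumulator."""
    return [acc + a * b
            for acc, a, b in zip(accumulator, first_register, second_register)]

acc = [0, 0, 0, 0]
acc = multiply_accumulate(acc, [1, 2, 3, 4], [10, 10, 10, 10])
acc = multiply_accumulate(acc, [1, 1, 1, 1], [5, 6, 7, 8])
print(acc)  # [15, 26, 37, 48]
```

Because the running sums grow with each instruction, holding them in elements wider than N bits delays overflow until the final transformation.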

55. The method as recited in claim [illegible], wherein N is any one of: eight, sixteen, thirty-two and
sixty-four.



56. The method as recited in claim [illegible], wherein said N-bit elements are integers.

57. The method as recited in claim [illegible], wherein each of said first and second vectors has a
width of 64 bits.

58. The method as recited in claim [illegible], wherein said accumulator is a register having a width
equal to an integer multiple of 64 bits.

59. The method as recited in claim [illegible], wherein said accumulator is a register having a width
of 192 bits.
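
The widths recited above fit together arithmetically. The sketch below assumes the 192-bit accumulator is divided evenly among as many elements as a 64-bit vector holds, which the claims do not state explicitly; the names are hypothetical.

```python
VECTOR_BITS = 64
ACCUMULATOR_BITS = 192

def accumulator_layout(n_bits):
    """Return (element count, bits per accumulator element) for N-bit elements."""
    count = VECTOR_BITS // n_bits            # elements in one 64-bit vector
    return count, ACCUMULATOR_BITS // count  # even split of the accumulator

print(accumulator_layout(8))   # (8, 24)
print(accumulator_layout(16))  # (4, 48)
```

Under that assumption, both N = 8 and N = 16 give accumulator elements of exactly three times N bits.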




60. The method as recited in claim [illegible], wherein said first register, said second register, and
said third register are floating point registers.

61. The method as recited in claim [illegible], wherein said first register, said second register, and
said third register each have a width of 64 bits.

62. A processor for providing extended precision in single instruction multiple data (SIMD)
arithmetic operations, comprising:

means for executing an arithmetic instruction involving an element of a first
vector and an element of a second vector to produce a resulting element, said first and second
vector comprising a plurality of N-bit elements;

an accumulator for receiving said resulting element, wherein said resulting
element is stored in an M-bit element of said accumulator and wherein M is greater than N;

means for transforming said resulting element in said accumulator into a width
of N-bits; and

means for writing said transformed resulting element to a register.

63. The processor as recited in claim 62, wherein said accumulator comprises a plurality of
M-bit elements and wherein said means for executing is repeated for said plurality of elements
of said first and second vectors to produce a plurality of resulting elements that are received by
said accumulator and wherein said means for transforming and said means for writing are
performed on said plurality of resulting elements.



64. The processor as recited in claim [illegible], wherein means for writing comprises:

selecting a subset of said resulting elements in said accumulator for writing to said
register, said subset being chosen from any one of: the low third bits, the middle third bits, and
the high third bits of said resulting elements in said accumulator.

65. The processor as recited in claim [illegible], wherein M is equal to three times N.

66. The processor as recited in claim 65, wherein N is equal to eight or sixteen.




67. The system as recited in claim [illegible], wherein said resulting elements in said accumulator
are wrapped around the representable range of said resulting elements.






68. The system as recited in claim [illegible], further comprising:

dividing said resulting elements stored in said accumulator into a plurality of
subsets;

writing each subset to at least one of a plurality of registers, each of said plurality
of registers having a width smaller than said accumulator width.

69. The system as recited in claim 62, further comprising:

means for formatting said resulting element in said accumulator as specified in
said arithmetic instruction.

70. The processor as recited in claim [illegible], wherein said arithmetic instruction is any one of:
addition, multiplication and subtraction.



71. The processor as recited in claim 62, wherein means for transforming comprises:

means for shifting said resulting element in said accumulator for scaling the value
of said resulting element;

means for rounding said resulting element; and

means for clamping said resulting element.

72. The processor as recited in claim 74, wherein said rounding means comprises one of:

means for rounding said resulting element towards zero;

means for rounding said resulting element towards the nearest unit, wherein said
resulting element is rounded away from zero if said resulting element is at least halfway towards
the nearest unit; and

means for rounding said resulting element towards the nearest unit, wherein said
resulting element is rounded towards zero if said resulting element is at least halfway towards
the nearest unit.


73. The processor as recited in claim [illegible], further comprising:

means for adding an element previously stored in said accumulator to said
resulting element, upon reception of said resulting element by said accumulator.

74. The processor as recited in claim [illegible], wherein N is any one of: eight, sixteen, thirty-two
and sixty-four.



75. The processor as recited in claim [illegible], wherein said N-bit elements are integers.

76. The processor as recited in claim 74, wherein each of said first and said second vectors
has a width of 64 bits.



77. The processor as recited in claim 76, wherein said accumulator is a register having a
width equal to an integer multiple of 64 bits.



